Abstract

Human immunoglobulin heavy chain variable domains (Vh) are promising scaffolds for antigen binding. However, Vh is an
unstable and aggregation-prone protein, which hinders its use for therapeutic purposes. To evolve the Vh domain, we
performed in vivo protein solubility selection that linked antibiotic resistance to the protein folding quality control
mechanism of the twin-arginine translocation pathway of E. coli. After screening a human germ-line Vh library, 95% of the
Vh proteins obtained were identified as Vh3 family members; one Vh protein, MG2x1, stood out among separate clones
expressing individual Vh variants. With further screening of a combinatorial framework mutation library of MG2x1, we found a
consistent bias toward substitution with tryptophan at positions 50 and 58 in Vh. Comparison of the crystal structures
of the Vh variants revealed that those substitutions with bulky side chain amino acids filled the cavity in the Vh interface
between heavy and light chains of the Fab arrangement, along with an increased number of hydrogen bonds, decreased
solvation energy, and increased negative charge. Accordingly, the engineered Vh acquired an increased level of
thermodynamic stability, reversible folding, and soluble expression. The library built with the Vh variant as a scaffold was
validated, as most Vh clones selected randomly were expressed in soluble form in E. coli regardless of the length of the
combinatorial CDR. Furthermore, the non-aggregating nature of the selected Vh was associated with an absence of humoral
response in mice, even when administered together with adjuvant. As a result, this selection provides an alternative directed
evolution pathway for unstable proteins, which is distinct from conventional methods based on phage display.
Introduction

The variable domain of the heavy or light chain (Vh or Vl) of a
human immunoglobulin G (IgG) molecule is the smallest part of
the antibody that preserves the original binding activity. Although
variable domains have short serum half-lives and lack effector
function, these defects can be ameliorated through the flexibility of
their format, for example by adopting an immune cell-engaging
strategy or introducing a long-acting module [1-3]. Furthermore,
their ability to access occluded or hidden epitopes, superior
bio-distribution, and cost-effective production make variable
domains potentially useful in therapeutic applications for which
full IgG molecules are not appropriate [4-6].

The instability of Vh and Vl of human IgG when they are not
assembled with each other has been a major concern for
biotechnological applications since Ward et al. reported that such
Vh domains are relatively sticky and therefore tend to aggregate [7].
This aggregation is primarily due to interactions between hydrophobic
patches residing at the interface between Vh and Vl. Direct
replacement of the interfacial hydrophobic residues of Vh or Vl
with hydrophilic amino acids has been partially successful in
improving protein stability. Three hydrophilic substitutions
(G44E/L45R/W47G) improve the solubility of Vh [8-10], but
these changes also decrease expression yield and thermal stability
due to the resultant deformations of the β-sheet structure [11-13].

In addition to rational mutation strategies, several groups have
adopted combinatorial approaches to engineer human Vh or Vl.
Jespers et al. screened a combinatorial CDR library bound to
protein A for aggregation-resistant Vh, using phage display
panning under heat-denaturing conditions [14]. They found that
mutations in the CDRs of human Vh can increase solubility and
promote reversible folding. Without the use of heat denaturation
in phage display, Barthelemy et al. isolated various mutant Vh
domains with increased stability and solubility [15]. To
eliminate the complicated step involving in vitro protein A panning,
To et al. selected monomeric human Vh domains directly from
bacterial lawns by plaque size [16]. These variant techniques
notwithstanding, most screenings of engineered Vh domains have
been conducted using phage display and protein A-binding
activity.

On the other hand, in vivo genetic selection methods distinct
from in vitro phage display have been applied in efforts to improve
protein solubility [17,18]. In one such in vivo method, the
twin-arginine translocation (Tat) pathway was exploited as an in vivo
protein fitness filter for fast folding and solubility of proteins of
interest, including single chain Fv [19-21]. However, such
approaches have not been attempted for Vh or Vl alone. In the
current study, we applied this system to evolve human Vh toward
greater stability and characterized the structural hallmarks of
greater stability and solubility.

Materials and Methods 

Ethics Statement 

All animal experiments were performed in accordance with the
guidelines for the care and use of laboratory animals recommended
by the Ministry of Food and Drug Safety of the Republic of
Korea. The experimental procedures were approved by the
Mogam Animal Care and Use Committee. The Mogam
Animal Care and Use Committee has since been renamed the Green
Cross Central Research Laboratory Animal Care and Use
Committee, which reviewed and approved the closing report of the
animal experiment.

Construction of the Tat-based genetic selection vector 

The vector system for screening of stable Vh domains was
modified from that described in previous reports [20,22]. Briefly,
TEM-1 β-lactamase (BLA) was ligated with the Tat signal sequence of
trimethylamine N-oxide reductase (ssTorA) of E. coli in pET9a,
yielding pET-TAPE (Figure 1A). Next, a fusion gene of ssTorA
with the representative human immunoglobulin heavy chain
variable domain Vh family type 2 (Vh2) was synthesized
(GenScript, USA). Vh2 was used as a template for PCR using a
5' primer (Table S1, primer 1) including an NdeI restriction site
and a 3' primer (Table S1, primer 2) including a NotI site, a 6×His
tag, and a BamHI site, to yield the NdeI-ssTorA-Vh2-NotI-6×His-BamHI
gene. This gene was inserted between the NdeI and BamHI
sites in the multi-cloning site of pET9a to yield pET9a-ssTorA-Vh2.
The NotI-BLA-BamHI segment was generated by PCR (Table
S1: primer 3 as sense, primer 4 as antisense) using BLA as a
template. This gene was inserted between the NotI and BamHI sites
of pET9a-ssTorA-Vh2, yielding pET9a-ssTorA-Vh2-BLA, which
was named pET-TAPE. A synthetic or human germ-line Vh
library was constructed by replacing the Vh2 gene in pET-TAPE.

Library design and construction 

cDNA for the human Vh library was obtained by reverse
transcription of mRNAs from the liver, peripheral blood
mononuclear cells, spleen, and thyroid (Clontech, Madison, WI,
US) using various primers (Table S1: primers 5-12 as sense, and
primers 13-15 as antisense). Each cloned human Vh gene
family (Vh1, Vh3, and Vh5) was inserted between the NdeI and
BamHI sites of pET-TAPE, yielding a pET-TAPE-Vh library with
approximately 10^ distinct clones. Mutations were introduced by
PCR using MG2x1 as the template and primers that introduced
mutations in the first fragment (Table S1: primers 16 and 17) and
the second fragment (Table S1: primers 18 and 19). Next, MG2x1
variant genes were synthesized by overlapping PCR of the two
gene fragments using primers 16 and 19 (Table S1). After digestion
of the MG2x1 variants with NcoI and NotI, the inserts were cloned
into pET-TAPE, yielding the frame-mutation Vh library with
approximately 10^ distinct clones.

Setup for Tat-associated protein engineering (TAPE) 
system 

Along with the construction of pET-TAPE, a protocol combining
liquid culture with rescue of genes of the correct size was
established to screen protein solubility in a high-throughput
manner. The antibiotic resistance of E. coli is
correlated with the translocation of soluble Vh-BLA fusion protein
into the periplasm via the Tat pathway. The TAPE system differs
from previously described systems [20] in that soluble proteins are
enriched in consecutive rounds of liquid culture with increasing
concentrations of antibiotic. E. coli T7 Express LysY/Iq was
transformed with the pET-TAPE-Vh library by electroporation.
Transformants were cultured in SOC (20 g/l Bacto tryptone, 5 g/l
Bacto yeast extract, 10 mM NaCl, 2.5 mM KCl, 10 mM MgCl2,
10 mM MgSO4, and 20 mM glucose) at 37°C for 1 h, and then
inoculated and cultured in liquid LB medium containing 50 μg/ml
ampicillin. When OD (600 nm) reached 0.6, cells were collected
by centrifugation and plasmid DNA was isolated. To prevent
enrichment of false-positives in subsequent rounds of selection,
isolated plasmids were digested with NcoI and BamHI, and digests
were subjected to gel electrophoresis to allow size selection of
full-length Vh-BLA genes. The size-selected Vh-BLA genes were
cloned between the NcoI and BamHI sites of pET-TAPE, and the
resultant plasmids were transformed into E. coli. Subsequently,
liquid culture was performed in repeated rounds with stepwise
increases in the concentration of ampicillin up to 500 μg/ml
(Figure 2). After performing 3-5 consecutive cycles of liquid
culture, clones were separated on an LB agar plate containing
ampicillin and 50 μg/ml kanamycin.

Host strains and plasmids 

E. coli T7 Express LysY/Iq (New England BioLabs, MA, USA)
was used as the host for the expression of the Vh domains and
their fusion proteins. pET9a (New England BioLabs, MA, USA) was
used to construct the TAPE system, i.e., for expression of fusion
proteins of various Vh domains and BLA. pET22b (New England
BioLabs, MA, USA) was used to express the Vh domain alone. All
other DNA manipulations were conducted according to common
methods.

Fractionation of soluble and insoluble Vh 

To determine the degree of soluble expression, individual Vh 
domains alone (i.e., without the BLA fusion) were expressed in E. 
coli. The soluble and insoluble fractions were separated after 
induction of Vh expression, followed by SDS-PAGE. Soluble and 
insoluble proteins were fractionated in lysis buffer (B-PER 
Reagent, Thermo Scientific, USA). The pellet was washed with 
PBS, and then resuspended in solubilization buffer (pH 7.4, 
50 mM NaH2PO4, 6 M urea, 0.5 M NaCl, and 4 mM DTT) to
obtain the insoluble fraction. Each fraction was prepared from the 
same quantity of cells to allow band intensities to be compared 
after gels were stained with Coomassie blue. 

Circular Dichroism 

Purified Vh domains were diluted to 0.2 mg/ml. The purity of
Vh domains used for CD measurement was demonstrated by
SDS-PAGE (Figure S1). CD was measured using a spectropolarimeter
(Jasco J-715 model, Jasco Inc, Easton, MD, US). Tm was
defined as the temperature at which a 50% reduction in the
soluble protein fraction was observed. The profile was recorded at
a wavelength of 235 nm as the temperature gradually increased
from 25 to 85°C at a rate of 1°C/min. All CD measurements were
repeated 3 times for each Vh domain. The p-value (paired t-test)
between two Vh domains was less than 0.005 for all possible pairs
of the tested Vh domains.
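
The 50% criterion for Tm can be illustrated with a short sketch. The function name, the linear interpolation between neighbouring points, and the melting curve below are assumptions made for illustration; they are not the authors' analysis procedure or data.

```python
def melting_temperature(temps, folded_fraction, threshold=0.5):
    """Return the temperature at which the folded fraction first drops
    through `threshold`, using linear interpolation between data points."""
    points = list(zip(temps, folded_fraction))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 >= threshold > f1:
            # Linear interpolation inside the bracketing interval.
            return t0 + (f0 - threshold) * (t1 - t0) / (f0 - f1)
    raise ValueError("the curve never crosses the threshold")


# Hypothetical melting curve (made-up numbers, not measured data).
temps = [25, 35, 45, 55, 65, 75, 85]            # degrees C
fraction = [1.00, 0.99, 0.95, 0.80, 0.40, 0.05, 0.00]

tm = melting_temperature(temps, fraction)
print(f"Estimated Tm: {tm:.1f} C")
```

With a 1°C/min temperature ramp, the measured points would be much denser than in this toy curve, so the interpolation error would be correspondingly small.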
Figure 1. Verification of the selection system, TAPE. (A) Plasmid map of pET-TAPE. (B) Average number of ampicillin-resistant colonies from
cultures harboring constructs for expression of a negative control (no Tat signal sequence, Vh3-BLA [-]), positive control (Tat signal sequence and
reporter gene only, ssTorA-BLA [+]), and published Vh domains (HuCAL Vh2, HuCAL Vh3, Dp47d, and HEL4). Each construct was expressed in LB
medium containing 50 μg/ml ampicillin. Cultures were induced by the addition of 1 mM isopropyl β-D-1-thiogalactopyranoside for 3 h after
inoculation. After the induction, cultures were spread onto agar plates containing 50 μg/ml ampicillin for colony counting. Data points are means and
standard deviations for three independent experiments.
doi:10.1371/journal.pone.0098178.g001



Recovery yield 

The recovery yield was defined as the level of soluble Vh after
heat denaturation. After aggregates were removed by centrifugation,
the concentration of soluble Vh was determined according to
the equation c = A/(ε × b), where A is the absorbance at 280 nm,
ε is the molar extinction coefficient (M⁻¹ cm⁻¹), b is the path
length (cm), and c is the molar concentration (mol/l). The
extinction coefficient was calculated using the amino acid
Figure 2. Schematic procedure for screening protein solubility using TAPE ('liquid screen'). (A) Construction of the pET-TAPE Vh library
(either germ-line or mutated) and transformation of the library into E. coli. (B) Liquid culture of the library with stepwise increases in antibiotic
concentration. (C) Collection of plasmids and purification of the intact Vh-BLA coding region. (D) Re-cloning of the Vh-BLA gene into pET-TAPE
between the NcoI and BamHI sites, and transformation into E. coli. Steps (B), (C), and (D) were repeated four times for each ampicillin concentration
(50, 100, 250, and 500 μg/ml).
doi:10.1371/journal.pone.0098178.g002
composition, assuming that all pairs of cysteine residues were
involved in disulfide bonds (web.expasy.org/protparam). Protein 
quality was confirmed by size-exclusion chromatography. 
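
As a minimal sketch of the calculation described above, the Beer-Lambert relation and the resulting recovery yield can be written as follows. The function names and the example numbers (including the extinction coefficient) are hypothetical; the sketch also assumes that the extinction coefficient and path length are identical before and after heating, so that the yield reduces to a ratio of absorbances.

```python
def molar_concentration(a280, epsilon, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * b), in mol/l."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return a280 / (epsilon * path_cm)


def recovery_yield_percent(a280_initial, a280_final):
    """Soluble fraction remaining after heating, as a percentage.

    With the same epsilon and path length before and after heating,
    the concentration ratio equals the absorbance ratio.
    """
    if a280_initial <= 0:
        raise ValueError("initial absorbance must be positive")
    return 100.0 * a280_final / a280_initial


# Hypothetical values chosen for illustration only.
epsilon = 25_000.0                      # M^-1 cm^-1 (assumed)
conc = molar_concentration(0.5, epsilon)
yield_pct = recovery_yield_percent(0.8, 0.6)
print(f"c = {conc:.2e} mol/l, recovery = {yield_pct:.1f}%")
```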

Humoral immune response of mice to the screened Vh 

BALB/c mice (six per group) were intravenously injected with
10 μg MG2x1, MG8-14, or Vhh on 3 consecutive days. The
injections were repeated at weeks 1, 4, and 8. Samples of immune
sera were obtained every week, and mice were sacrificed at day 65.
For intramuscular and subcutaneous injections, BALB/c mice (six
per group) were injected with 1 or 10 μg of MG8-14 or Vhh. Vhh
is identical to Vhh #3E, which binds to tumor necrosis factor-α
[23]. The injection was repeated every 2 weeks for a total of five
injections. The mice were sacrificed 2 weeks after the final
injection. Samples of immune sera were obtained every 2 weeks, 1
day before the next injection. To measure antibody titers,
enzyme-linked immunosorbent assays were performed using 96-well plates
coated with MG2x1, MG8-14, or Vhh, and HRP-labeled goat
anti-mouse antibody as a secondary antibody, followed by the
addition of 3,3',5,5'-tetramethylbenzidine and measurement of
OD (490 nm).

Results 

Verifying TAPE 

To verify whether the TAPE system can discriminate between
proteins of different solubilities, we applied it to various
published Vh domains whose soluble expression levels are well
known. The Vh domains were cloned into the pET-TAPE vector
(Figure 1A), allowing them to be expressed in E. coli as fusions with
BLA and the Tat signal sequence of ssTorA. The antibiotic
resistance of strains carrying each construct was measured by
counting the cell number in cultures containing 50 μg/ml
ampicillin. Cells expressing BLA alone (ssTorA-BLA [+], positive
control) exhibited the highest resistance, and cells expressing
HEL4 were approximately as resistant as the positive control
(Figure 1B) [22]. Cells expressing the other representative Vh3
family genes, Dp47d and Vh3 (HuCAL), exhibited resistances
intermediate between those of the positive and negative controls
[24,25]. The resistance of cells expressing the antibiotic resistance
gene with no Tat signal sequence (Vh3-BLA [-], negative control)
was lower than that of cells expressing any other construct, with
the notable exception of the Vh2 (HuCAL) construct
(ssTorA-Vh2-BLA). Cells expressing Vh2 exhibited the lowest ampicillin
resistance, even lower than that of the negative control. Since
Vh2 was expressed almost exclusively as inclusion bodies
(Figure 3A), the biostatic effect of Vh2 aggregate formation in E.
coli might have further slowed cell growth beyond the bactericidal
effect of the antibiotic.

Screening of human germ-line Vh library via TAPE 

In the Tat-associated screening system using ampicillin-containing
agar plates, false-positive clones containing small Vh
peptide fragments were often enriched because such fragments are
highly compatible with the Tat pathway. To overcome this
problem, previous screens have included a step to exclude clones
with excessively high antibiotic resistance (i.e., counter-selection)
[20]. In this study, to perform Vh solubility screening in a
high-throughput manner, we enriched antibiotic-resistant clones in
liquid cultures ('liquid screen') containing various concentrations
of ampicillin (50-500 μg/ml) (Figure 2). Furthermore, to avoid
enrichment of short Vh gene fragments that might yield
false-positive results, full-size Vh-BLA fusion genes were recovered by
gel purification. In contrast to the plate-based method, which
limits library size, the liquid screen with a culture larger than
100 ml can cover library sizes greater than 10^ because 1 ml of an
overnight culture of E. coli in LB with ampicillin normally
contains about 10^ cells. The size of the human germ-line Vh
library for TAPE was about 2.17x10^.

After the third round of TAPE through selection of antibiotic 
resistance, 154 Vh sequences were selected from the human germ- 
line Vh library that had been constructed using primers specific 
for the Vh1, Vh3, and Vh5 families. These 154 Vh sequences
were classified into 19 different Vh family types. Of the 154 total 
Vh hits, 146 (94.8%) were identified as members of the Vh3 
family; this frequency is significantly higher than the Vh3 family 
frequency in the library prior to TAPE (101 Vh3 family members 
Figure 3. SDS-PAGE of soluble and insoluble fractions of E. coli. (A) Previously characterized Vh domains (Vh2, Vh3, Vh6, DP47d, and HEL4).
(B) Vh domains chosen randomly from the human Vh germ-line library (RD1-3) or selected from the human germ-line library using TAPE (MG4x4-44,
MG4x4-25, MG10-10, and MG2x1). Cultures expressing each Vh domain were harvested after induction at 25°C for 3.5 h, and soluble (S) and insoluble
(I) fractions were prepared. Lane 'MW' contains a protein size marker; the size of each marker is indicated (in kD) to the left of each panel. In both
panels, the mobilities of Vh domains correspond to the 15-kD protein size marker. Different parts of separate gels are grouped to align
expression patterns for the soluble and insoluble fractions of each Vh domain.
doi:10.1371/journal.pone.0098178.g003
Table 1. Isolated germ-line Vh genes after the third round of TAPE.

V gene name    % of Vh after TAPE (a)    % of Vh before TAPE (b)    Fold increase (c)
Vh3-7          6.5                       4.3                        1.5
Vh3-9          0.6                       2.1                        0.3
Vh3-15         7.1                       3.9                        1.8
Vh3-21         1.3                       6.1                        0.2
Vh3-23         14.3                      16.0                       0.9
Vh3-30         46.8                      11.4                       4.1
Vh3-33         1.3                       2.9                        0.4
Vh3-43         0.6                       0.4                        1.5
Vh3-48         7.1                       3.9                        1.8
Vh3-49         0.6                       0.4                        1.5
Vh3-53         0.6                       1.0                        0.6
Vh3-72         0.6                       1.0                        0.6
Vh3-74         7.1                       2.5                        2.8
others         0                         13.7
Vh3 family     94.8                      70.1                       1.4
Vh1-2          0.6                       1.4                        0.4
Vh1-8          0.6                       0.0
Vh1-46         0.6                       3.5                        0.2
Vh1-69         0.6                       4.7                        0.1
others         0                         9.7
Vh1 family     2.6                       19.4                       0.1
Vh5-51         1.3                       4.3                        0.3
Vh5-a          1.3                       5.7                        0.2
others         0                         0.7
Vh5 family     2.6                       10.5                       0.2

(a) Proportion of each identified Vh gene among the 154 sequences selected after TAPE.
(b) Proportion of each identified Vh gene among 144 sequences randomly selected from the library.
(c) Ratio of '% of Vh gene after TAPE' to '% of Vh gene before TAPE'.
doi:10.1371/journal.pone.0098178.t001



out of 144 sequences: 70.1%). Among the Vh3 family genes
isolated from the germ-line Vh library, the Vh3-30 and Vh3-23
genes were predominant. On the other hand, the frequencies of
the Vh1 and Vh5 families decreased to 0.1-fold and 0.3-fold of
their initial values, respectively. Overall, as a result of TAPE, the
Vh3 family was enriched 1.4-fold (i.e., from 70.1% to 94.8%),
whereas the other families became less abundant (Table 1).
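
The enrichment figures quoted above follow from simple ratios of the clone counts. The sketch below recomputes them from the counts and percentages given in the text and Table 1; the helper names are illustrative and not part of the original analysis.

```python
def percent(count, total):
    """Share of `count` in `total`, as a percentage."""
    return 100.0 * count / total


def fold_change(after_pct, before_pct):
    """Ratio of the frequency after TAPE to the frequency before TAPE.

    Returns None when the gene was absent before selection, because the
    ratio is then undefined.
    """
    if before_pct == 0:
        return None
    return after_pct / before_pct


# Counts taken from the text: 146 of 154 hits after TAPE were Vh3,
# versus 101 of 144 randomly picked clones before TAPE.
vh3_after = percent(146, 154)
vh3_before = percent(101, 144)
vh3_fold = fold_change(vh3_after, vh3_before)

# Percentages for Vh3-30 as listed in Table 1.
vh3_30_fold = fold_change(46.8, 11.4)

print(f"Vh3 family: {vh3_before:.1f}% -> {vh3_after:.1f}% ({vh3_fold:.1f}-fold)")
print(f"Vh3-30: {vh3_30_fold:.1f}-fold")
```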

To determine the degree of soluble expression of isolated
individual Vh domains lacking the BLA fusion, the soluble and
insoluble fractions were separated after expression of the
corresponding genes, and their expression patterns were compared
with those of various Vh domains published previously [24,25].
Vh domains randomly selected from the germ-line Vh library
were expressed predominantly as inclusion bodies (Figure 3B,
RD1-3), whereas the soluble expression levels of Vh domains
selected by TAPE, e.g., MG4x4-44, MG4x4-25, MG10-10, and
MG2x1, were significantly increased (Figure 3B). Moreover, the
Vh domains selected by TAPE exhibited a higher ratio of soluble
to insoluble protein than the previously characterized Vh domains
described above, i.e., Vh2 (HuCAL), Vh3 (HuCAL), Vh6
(HuCAL), Vh3 (DP47d), and HEL4 (Figure 3A).



An artificial library comprising 25 individual Vh domains,
either selected from the germ-line library or previously
characterized (HEL4, DP47d, HuCAL Vh3, and HuCAL Vh2),
was subjected to TAPE. Only one clone, MG2x1, grew out at the
third round of TAPE. This clone was used as the backbone for the
frame-mutation library with selected mutation sites, described
below.

Screening of the frame-mutation library of MG2x1 via
TAPE

To confer additional solubility and stability on MG2x1,
combinatorial mutations were introduced into seven specific sites of
MG2x1 to generate the MG2x1 frame-mutation library. The
number of distinct clones in the library was 1.4x10^, which covers
all the possible combinations of mutations with the NNK degenerate
codon (theoretically, 6.4x10^ combinations). The selected mutation
sites are distributed over CDRH1 (S35), frame 2 (Q39,
L45, and W47), and CDRH2 (A50, Y58, and A60) according to the
Kabat numbering system (Figure 4A, residues in red). These sites
were selected by referring to the crystal structure of MG2x1 (PDB
Figure 4. Rationale for the design of the combinatorial frame-mutation library. (A) Positions chosen for randomization based on the crystal
structure of MG2x1. Residues are numbered according to the Kabat scheme for the Vh sequence. (B) Representative sequences (MG8-4, MG8-14,
MG4-13, and MG8-6) selected from the MG2x1 frame-mutation library by TAPE were aligned with the original MG2x1 sequence. Mutation sites in the
sequence of MG2x1 are shown as bold dots. All mutations were introduced using degenerate codons (NNK), except that serine (S) 35 was replaced by
glycine (G). X represents all amino acids. At positions 50 and 58, the mutations converged primarily onto tryptophan, indicated by dashed boxes.
doi:10.1371/journal.pone.0098178.g004



ID: 3ZHK) to identify amino acids whose side chains extend
outward from the surface. In addition, all of these sites are located in the
β-sheet structure, away from the flexible loops of the CDRs. The
frame-mutation library of MG2x1 was screened by TAPE, with
the concentration of ampicillin increased (50, 100, 250, and
500 μg/ml) in successive rounds. After the final round of TAPE,
41 clones were randomly selected for sequencing of their Vh
domains. Changes at positions 50 and 58 (Kabat scheme) were
biased toward tryptophan (W): alanine (A) at position 50 was
replaced by W in 39% (16/41) of the clones, and tyrosine (Y) at
position 58 was replaced by W in 58% (24/41) of the clones
(Table 2). The other mutation sites were not particularly biased.
Sequence alignment of the selected Vh domains after TAPE
revealed the biased amino acids at positions 50 and 58 (Figure 4B,
dashed box). Based on the biased mutation frequencies at positions
50 and 58, we generated an MG8-14 mutant in which leucine (L) at
position 50 was replaced with W (MG8-14 [L50W]) for further
analyses of its physicochemical properties.
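
The theoretical diversity of an NNK library mentioned above can be checked with a short enumeration. The sketch below assumes six fully randomized positions, on the basis that the Figure 4 caption states that S35 was replaced only by glycine; that count is an interpretation, not a number stated by the authors. Under this assumption the amino acid diversity is 20^6 = 6.4 × 10^7, which agrees with the mantissa of the theoretical figure quoted in the text.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# NNK: N is any base, K is G or T.
nnk_codons = ["".join(c) for c in product("ACGT", "ACGT", "GT")]
encoded = {CODON_TABLE[c] for c in nnk_codons}
amino_acids = encoded - {"*"}          # drop the stop codon (TAG)

n_sites = 6                            # assumption: six NNK positions
protein_diversity = len(amino_acids) ** n_sites
dna_diversity = len(nnk_codons) ** n_sites

print(f"NNK codons: {len(nnk_codons)}, amino acids: {len(amino_acids)}")
print(f"Amino acid combinations over {n_sites} sites: {protein_diversity:.1e}")
print(f"Codon combinations over {n_sites} sites: {dna_diversity:.1e}")
```

Because 32 codons encode 20 amino acids plus one stop, the number of distinct DNA sequences is larger than the number of distinct proteins, which is one reason an experimental library is usually made larger than the theoretical protein diversity.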

Soluble expression level and thermodynamic stability are
correlated in Vh domains selected by TAPE

Among the hits obtained from the combinatorial frame-mutation
library of MG2x1, 23 unique sequences were selected
from the final round of TAPE. Most of the selected Vh domains
were expressed as soluble proteins. In particular, MG8-14, MG2-55,
MG4-5, MG4-13, MG8-4, and MG8-6 were expressed
exclusively in their soluble forms (Figure 5). A previous study using
the Tat pathway to express a protein fused to an antibiotic
resistance marker showed that the ability to confer growth was
correlated with both the solubility profile and the molecular weight
of the protein [26]. The thermodynamic stabilities of the Vh
domains selected from the naive human Vh library by TAPE were
higher than those of wild-type Vh3 domains. The melting
temperatures (Tm) of the selected germ-line Vh domains were
55.6-65.2°C, whereas the Tm values of the randomly chosen Vh
domains from the germ-line library were generally below 50°C,
e.g., 46.5°C for Vh3-15 (Figure 6A). Among the selected
germ-line Vh domains, MG2x1 had the highest Tm. Furthermore, the
Tm values of Vh domains selected from the combinatorial
frame-mutation library of MG2x1 (65.2-77.5°C) were significantly
higher than that of the parental Vh (MG2x1) (Figure 6B). The
thermodynamic stabilities of the engineered Vh domains identified
in this study were generally higher than that of HEL4, which was
selected from a combinatorial CDR3 library based on Dp47d by
heat-resistant phage display selection [24].

Selected Vh domains fold autonomously after 
denaturation 

Proteins exist in thermodynamic equilibrium between their
folded and unfolded states. Hence, unstable proteins are much
more vulnerable to heat and pH disturbance because exposure of
their hydrophobic core during occupancy of the unfolded state
promotes aggregation. Many Vh3 family domains are soluble and
aggregation-resistant. However, once these proteins are denatured,
they never refold into their native conformation. This was
the case for all Vh domains selected from the germ-line library in
this study, including MG2x1. However, some of the Vh domains
selected from the frame-mutation library of MG2x1 by TAPE
refolded reversibly after denaturation. Far-UV circular
dichroism (CD) spectra suggested that MG8-14 could refold
after heat denaturation at 85°C (Figure 7C), whereas the
parental Vh domain, MG2x1, could not (Figure 7A). Furthermore,
the modified MG8-14 [L50W] had a perfect renaturation
profile (Figure 7D). MG8-6 had the highest Tm, but could not
refold after denaturation (Figure 7B). The recovery yield for the
selected Vh after heat denaturation reached 95% (Table 3), in
contrast to that of the parental sequence (MG2x1), which was
below 5%.

Structural features underlying the superior biophysical 
properties of selected Vh domains 

Superimposition of crystal structures of the parental Vh,
MG2x1 (PDB ID: 3ZHK), and the modified Vh domains MG8-4
(PDB ID: 3ZHD) and MG8-14 (PDB ID: 3ZHL) revealed that
these proteins have the same overall topology: two β-sheets
connected by a disulfide bond between C22 and C96, yielding a
typical β-sandwich lectin fold structure (Figure 8A and Table S2).
The random amino acid changes introduced in the combinatorial
frame-mutation library of MG2x1 are positioned on the β-strand
that forms the sandwich scaffold; in particular, they are located on
the side of the sandwich corresponding to the hydrophobic
interface region between heavy and light chains in the typical Fab
complex arrangement (Figure 8B). Mutations in MG8-4 and
MG8-14 altered the conformation of the flexible CDRH3 loop,
Figure 5. Soluble expression level of screened Vh domains. SDS-PAGE of soluble (A) and insoluble (B) fractions of E. coli expressing each Vh
domain selected from the frame-mutation library of MG2x1. Lanes were loaded as follows: 1, Vhh (camel single-domain antibody); 2, HEL4; 3, MG2x1; 4,
MG8-14; 5, MG2-47; 6, MG2-55; 7, MG2-57; 8, MG2-59; 9, MG4-2; 10, MG4-5; 11, MG4-6; 12, MG4-7; 13, MG4-12; 14, MG4-13; 15, MG4-17; 16, MG4-20;
17, MG4-28; 18, MG4-32; 19, MG4-33; 20, MG8-4; 21, MG8-5; 22, MG8-6; 23, MG8-8; 24, MG8-11; 25, MG8-12; 26, MG8-13. Lanes labeled 'MW' contained
protein size markers (10 and 15 kD). U indicates fractions from a culture with no isopropyl β-D-1-thiogalactopyranoside.
doi:10.1371/journal.pone.0098178.g005



whereas the CDRH1 and CDRH2 loops remained in their
original conformations (Figure 8C and Table S2).

Surface electrostatic calculations revealed that MG8-4 and
MG8-14 exhibited increased partial negative charge next to the
hydrophobic patch, possibly due to the introduction of a charged
group such as aspartate (D) at position 60, whereas substantial
positive charge was detected next to the exposed surface of the
heavy chain in all three structures (MG2x1, MG8-4, and MG8-14)
(Figure 9A and 9B). The solvation energies of MG8-4 and MG8-14
(-1166.8 kcal/mol and -1153.4 kcal/mol, respectively) were
significantly lower than that of MG2x1 (-1047.5 kcal/mol),
suggesting that the charged residues on the surface contribute to
the solvation energy, and hence the solubility, of the protein.
Analysis of surface features revealed a significantly increased
number of hydrogen bonds between side chains of the residues of
MG8-4 and MG8-14 (26 and 39, respectively), whereas only 19
hydrogen bonds were observed in MG2x1, indicating that the
architecture of MG8-14 is more stable than that of MG2x1. In
addition, the structures of MG8-4 and MG8-14 contained more
charge-charge interactions (8 and 9, respectively) than the
structure of MG2x1 (5) (Table 4).

MG2x1 contains a prominent pocket comprising residues W47,
A50, and Y58, with a cavity area of 32 Å² and a volume of
19.5 Å³, centered at residue A50 (Figure 8B and Figure 9C).
Sequence analysis of Vh domains selected by TAPE revealed that
two positions in the framework, A50 and Y58, were consistently
biased toward W. Residue A50 was also replaced by leucine (L) or
W in representative selected Vh domains such as MG8-4, MG8-14,
MG8-6, and MG4-13, suggesting that replacement of this
residue with a bulky side chain is related to the stability of the
molecule. The structural model of the modified MG8-14 [L50W]
suggests that the cavity is filled with a triad of bulky side chains
consisting of W50, W47, and W58 (Figure 9D). Accordingly, the
modified MG8-14 [L50W] exhibited high thermodynamic stability
as well as reversible folding after heat denaturation (Figure 7D).



Figure 6. Thermodynamic stability. (A) Representative Vh domains selected from the germ-line library. (B) Representative Vh domains selected
from the MG2x1 frame-mutation library. The black bold line indicates the profile of the parental Vh, MG2x1, prior to mutation. The x-axis of both panels
is temperature (°C). Folding fraction was converted from the temperature-scouting CD profile at a fixed wavelength (230 nm).
doi:10.1371/journal.pone.0098178.g006
Figure 7. Far-UV CD spectra for the detection of reversible folding. (A) MG2x1. (B) MG8-6. (C) MG8-14. (D) Modified MG8-14 [L50W]. Black
lines indicate profiles for Vh in the native state at 25°C; red lines indicate profiles for Vh denatured at 85°C; and green lines indicate profiles for Vh
renatured at 25°C.
doi:10.1371/journal.pone.0098178.g007



Validation of the combinatorial CDRH synthetic library
built on MG8-14 scaffold

To confirm the effects of CDR variation on the stability of the Vh
scaffold, we examined the soluble expression level of Vh domains
containing CDRH3 regions of various lengths (7-13 amino acids),
using a combinatorial CDRH synthetic library based on MG8-14.
Eight or nine different sequences of each length were randomly
selected and expressed in E. coli; 64 of 73 (88%) Vh clones were
expressed in soluble form. In addition, 11 different sequences from
a rational mutation library (CDRH3 length fixed, with seven
positions in CDRH1, CDRH2, and CDRH3 of MG8-14 randomized) were



Table 3. The recovery yields of selected Vh after thermal stress. Data are means and standard deviations for three independent
heat-denaturation treatments of the same sample.

Vh domain         Initial A280 nm (a)   Final A280 nm (b)   Recovery yield (%) (c)
MG2x1 (parent)    0.964 ± 0.011         0.041 ± 0.006       3.843 ± 0.662
MG8-4             0.714 ± 0.013         0.653 ± 0.049       91.389 ± 5.241
MG8-14            0.726 ± 0.010         0.643 ± 0.062       88.458 ± 7.370
MG8-14 [L50W]     0.722 ± 0.011         0.695 ± 0.022       96.257 ± 2.305
HEL4              0.964 ± 0.011         0.878 ± 0.032       91.097 ± 2.348

(a) Absorbance at 280 nm at 25°C.
(b) Absorbance at 280 nm after heating (85°C) followed by cooling (25°C). Aggregates after heating were removed by centrifugation.
(c) The recovery yield was defined as the fraction of soluble Vh remaining after heating at the denaturation temperature (85°C).
doi:10.1371/journal.pone.0098178.t003
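The recovery yield defined in footnote (c) is a simple ratio of absorbances. The sketch below shows that calculation under the assumption that the yield is the final A280 divided by the initial A280, expressed as a percentage; the function names are illustrative, and per-replicate readings are not given in the source.

```python
from statistics import mean, stdev


def recovery_yield(initial_a280, final_a280):
    """Percent of soluble protein remaining after heating and cooling."""
    if initial_a280 <= 0:
        raise ValueError("initial absorbance must be positive")
    return 100.0 * final_a280 / initial_a280


def summarize_yields(replicates):
    """Mean and sample standard deviation of per-replicate recovery yields.

    `replicates` is a sequence of (initial_a280, final_a280) pairs.
    """
    yields = [recovery_yield(initial, final) for initial, final in replicates]
    return mean(yields), stdev(yields)


if __name__ == "__main__":
    # Mean absorbances for MG8-14 [L50W] taken from Table 3.
    print(round(recovery_yield(0.722, 0.695), 2))
```

Because the table reports the mean of per-replicate yields, a ratio computed from the mean absorbances is close to the tabulated value but need not match it exactly.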
Figure 8. Structures of MG2x1, MG8-4, and MG8-14. (A) Structure of MG2x1 with CDRH1 (yellow), CDRH2 (red), and CDRH3 (blue). Mutation
sites for the MG2x1 mutation library are indicated as sticks (magenta). (B) Antibody light chain, in surface rendering (cyan), is shown to highlight the
relative locations of CDR H1-H3 and the mutation sites in MG2x1 (magenta). The circle indicates the cavity area. (C) Superposition of three Vh
domains (MG2x1, MG8-4, and MG8-14), showing the variation in the loop in the CDRH3 region (blue).
doi:10.1371/journal.pone.0098178.g008



randomly tested; all of the tested sequences were expressed in soluble
form in the cytoplasm of E. coli under reducing conditions
(Figure 10). Thus, aggregation occurred infrequently regardless
of CDR alteration in a combinatorial CDRH library that used
the MG8-14 framework as a scaffold.







Figure 9. Surface features of MG2x1, MG8-14, and modified
MG8-14 [L50W]. (A) Electrostatic charge distribution on the solvent-
accessible surface of MG2x1 (red, -5 k; blue, +5 k). (B) Electrostatic
charge distribution on the solvent-accessible surface of MG8-14. (C)
Surface representation of MG2x1 showing the prominent cavity around
residues 50 and 58. CDR regions are colored in yellow (CDRH1), red
(CDRH2), and blue (CDRH3). Mutation sites are colored in magenta to
highlight the cavity. (D) Surface representation of the structural model
of the modified MG8-14 [L50W], in which L50 is replaced by W.
doi:10.1371/journal.pone.0098178.g009



Humoral response to MG2x1 and MG8-14 in mouse

To test the humoral immune response to the selected Vh
domains, BALB/c mice were subjected to repeated immunization
with selected Vh domains, administered by various routes.
Antibody against MG2x1 was undetectable after nine intravenous
injections of 10 µg protein over 9 weeks (Figure 11A). Furthermore,
there was no antibody-boosting response, even when
injections included Freund's Complete Adjuvant (CFA), in four
of six mice at week 9 (Figure 11A). In the case of MG8-14, there
was no detectable anti-MG8-14 antibody until week 6, although a
mild antibody response was present in half of the tested mice at
week 9 (Figure 11A). On the other hand, a camel single-domain
antibody, Vhh [23], was more immunogenic than MG2x1 and
MG8-14, as shown by the high titer after the first injection (with
CFA) at week 3 (Figure 11B). Intramuscular and subcutaneous
injection of 1 µg MG8-14 resulted in no antibody response against
MG8-14 throughout a 10-week course of immunization
(Figure 11D), whereas Vhh injection caused an increase in
antibody titer starting at week 6 (Figure 11E). When mice were
injected intramuscularly with 10 µg MG8-14, a moderate
anti-MG8-14 antibody response was elicited at week 10 in only one of six
mice. Subcutaneous injection of 10 µg MG8-14 elicited no
antibody response until the fourth injection at week 6; moderate
levels of anti-MG8-14 antibody were detectable after this time
point (Figure 11D). Among mice subjected to intramuscular and
subcutaneous injection of Vhh, most animals exhibited an anti-Vhh
antibody response at week 4, immediately after the second
injection (Figure 11E).

Discussion 

The external diameter of the TatABC complex is around
160 Å, but its pore is relatively small [27]. Variations in complex
size may result in variations in pore size, influencing the
compatibility of each complex with differently sized Tat substrate
proteins [28]. The capacity of the Tat system to export proteins via
membrane-bound TatABC complexes varies among species of
Gram-negative bacteria. For example, the A. tumefaciens TatABC
complex is capable of exporting large (>80 kDa) proteins [29],
whereas in E. coli, the correlation between protein folding and
export to the periplasm via the Tat pathway is poorer for proteins
larger than 30 kDa than for proteins of a lower molecular weight [26].
The molecular weight of the Vh domain is around 14 kDa;
therefore, this group of proteins was predicted to be compatible
Table 4. Analyses of the structural features for MG2x1, MG8-4, and MG8-14.

Feature                                 MG2x1     MG8-4     MG8-14    MG8-14 [L50W]
Solvent accessible surface (Å²)         5958.6    6009.2    6393.5    6141.6
Aromatic-aromatic interactions          4         4         4         4
Main chain-side chain hydrogen bonds    48        49        48        48
Side chain-side chain hydrogen bonds    19        26        39        35
Ionic interactions                      5         9         8         8
Solvation energy (kcal/mol)             -1047.5   -1166.8   -1153.4   -1155.9

doi:10.1371/journal.pone.0098178.t004



with the Tat pathway of E. coli. Consistent with this expectation, in
this study, the export of Vh in vivo corresponded well with
properties related to protein stability in vitro. Accordingly, because
the Vh3 family is the most soluble of the seven Vh families
(Vh1-7), the Vh3 family was enriched via TAPE (Table 1) in a screen of
a human germ-line library. This suggests that selection was driven
by the function of the Tat pathway, which serves as a 'molecular
sieve' in vivo, as discussed in many previous works
[28,30,31].

We attempted to compare the ampicillin resistance of individual Vh
variants by visual measurement. However, spot
analysis of serial dilutions of the culture containing ampicillin [22]
was not sensitive enough to allow a direct comparison of
resistance in this study (data not shown). To overcome this
limitation, we performed a head-to-head competition for
ampicillin resistance among the 25 germ-line Vh domains (the
artificial library) through a third round of selection in liquid culture.
This experiment yielded MG2x1, a Vh3 family member (Vh3-23),
as the sole survivor, and it was used as the backbone of a
frame-mutation library. This library was then subjected to another
round of TAPE, with the goal of improving the physicochemical
properties of this protein. Considering that MG2x1 is already
relatively soluble and stable, one might expect only a marginal
improvement from directed evolution via TAPE. However,
subjecting the frame-mutation library to selection resulted in a
significant improvement in folding-related properties.

Studies of the protein folding quality control mechanism of the
E. coli Tat pathway have primarily focused on the tendency of
proteins to be expressed in soluble form [19,20]. However, the
correlation between selection via Tat-mediated protein folding
and increases in the thermodynamic stabilities of proteins of
interest has not been clearly demonstrated. In this study, we
showed that both protein expression in soluble form and properties
related to thermodynamic stability were clearly improved by
Tat-associated screening. Foit et al. also demonstrated that antibiotic
resistance bestowed by the tripartite fusion protein is correlated
with stability in vivo and thermodynamic stability in vitro [32].
Although both methods use the same reporter gene, i.e., BLA,
protein folding occurs in a different environment: the periplasm
for the tripartite system and the cytoplasm for TAPE. Under the
reducing conditions of TAPE for protein folding, some of the
evolved Vh domains were capable of autonomous refolding over repeated
Figure 10. Validation of the combinatorial CDRH synthetic library built on MG8-14 scaffold. SDS-PAGE of soluble and insoluble fractions
of E. coli expressing Vh domains selected randomly from the combinatorial CDRH3 synthetic libraries. Coomassie-stained gels are aligned by lane
numbers (columns) and amino acid lengths of CDRH3 (rows). Images depict the region of the gel corresponding to the size of Vh. Some images were
combined from separate gels for the purpose of alignment (indicated by a dividing bar between gels). 'MW' indicates the protein size marker
corresponding to a molecular weight of 15 kD.
doi:10.1371/journal.pone.0098178.g010
Figure 11. Humoral immune response of BALB/c mice. Six mice (per group) received multiple intravenous injections of PBS (P), MG2x1
(A), MG2x1 plus CFA (A), MG8-14 (B), MG8-14 plus CFA (B), Vhh (C), or Vhh plus CFA (C) at week 3 (w3), week 6 (w6), and week 9 (w9). Six mice (per
group) received multiple injections of 1 µg MG8-14 intramuscularly (im, D), 1 µg MG8-14 subcutaneously (sc, D), 10 µg MG8-14 intramuscularly
(im, D), 10 µg MG8-14 subcutaneously (sc, D), 1 µg Vhh intramuscularly (im, E), 1 µg Vhh subcutaneously (sc, E), 10 µg Vhh intramuscularly (im, E), or
10 µg Vhh subcutaneously (sc, E) at week 2 (w2), week 6 (w6), and week 10 (w10).
doi:10.1371/journal.pone.0098178.g011



cycles of heating and cooling. More reversible refolding and a 
higher recovery yield should increase resistance to mechanical or 
thermal stresses during the purification process, as well as improve 
long-term storage due to the low exposure rate of hydrophobic 
patches [33]. 

Christ et al. demonstrated that the frequency of aggregation-resistant
domains was about 80% in the repertoire after heat-cooling
selection and about 71% in the large aggregation-resistant
repertoire generated by combinatorial ligation of CDR-encoding
regions [34]. In this study, the frequency of aggregation-resistant
Vh domains in combinatorial CDRH3 repertoires with a fixed
scaffold (MG8-14) screened by TAPE was 88%, regardless of the
length of the CDRH3 region (Figure 10). With the exception of the
CDRH3 region, the crystal structures of MG8-4 and MG8-14
superimposed closely with that of the parental Vh, MG2x1, despite
containing mutations in the frame region (Figure 8C). In addition,
residue L50 of MG8-14 had the lowest
observed B-factor (32), whereas the average B-factor was 43.3.
These observations suggest that the core of this region is very rigid,
but is still capable of accommodating various structures of
CDRH3. Because the framework and CDR regions of a scaffold are
conformationally interdependent, a stability-function tradeoff is to be
expected when stability-enhancing mutations are introduced into an
already functional protein, for example, an scFv [20]. In contrast, we
first selected a stable Vh scaffold and then generated the
combinatorial CDRH synthetic library to add functionality afterward.
Because the affinities of Vh domains screened from the library against
several antigens, including HER3, TNF-α, and albumin, were all in the
sub-nanomolar range, we expect stability-function tradeoffs
to be minimal when functional Vh domains are screened
from a library of this quality (data not
shown).



The modified MG8-14 [L50W] contains three W residues that
fill a large cavity of MG2x1 near the Vh/Vl interface. Van der
Waals interactions in this region would enhance stable architecture,
allowing reversible folding of the antibody during the
refolding process after denaturation. Within the cavity structure,
high temperature leads to thermal destabilization as a result of
water permeation [35,36]. Therefore, water molecules in the
hydrophobic cavity of MG2x1 may directly affect thermal
resilience and promote structural perturbation. Taken together,
these data demonstrate that surface properties are important
factors in selection of single-domain antibodies with high solubility
and thermodynamic stability.

Vh domains that had been selected by heat-denatured phage 
display from a combinatorial CDR repertoire exhibited an 
enrichment of certain amino acids at several positions within the 
CDR regions, including glycine at position 35 and glutamate at 
position 32 [37]. Our differentiated in vivo selection strategy, using 
the Tat pathway in E. coli, resulted in a unique preference for 
tryptophan at positions 50 and 58, leading to the creation of a 
bulky ring structure. We believe that this preference helps Vh to 
acquire a stable conformation, preventing structural perturbation 
during folding and refolding. 

MG2x1 contains a negatively charged amino acid, aspartic acid
(D) at position 61, which was previously identified as a
determinant of protein aggregation and solubility [38]. In MG8-4
and MG8-14, which were selected from the MG2x1 frame-mutation
library, D was incorporated consecutively at positions 60
and 61, significantly increasing the net negative charge. This
preference for adjacent D residues has also been observed in other
protein stability screens of combinatorial CDR repertoires. For
example, positions 32 and 33 of Vh and positions 52 and 53 of Vl
are determinants for aggregation resistance [37].
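The effect of an added aspartate on net charge can be illustrated with a simple residue count. This is a minimal sketch, assuming formal charges of -1 for D and E and +1 for K and R near neutral pH, with histidine and the termini ignored. The seven-residue fragment and the 0-based index used below are hypothetical examples and do not follow the numbering used in the text.

```python
# Formal side-chain charges near neutral pH (a simplifying assumption).
CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}


def net_charge(sequence):
    """Sum formal side-chain charges over a one-letter amino acid sequence."""
    return sum(CHARGE.get(residue, 0) for residue in sequence.upper())


def substitute(sequence, index, residue):
    """Return a copy of `sequence` with the residue at 0-based `index` replaced."""
    if not 0 <= index < len(sequence):
        raise IndexError("index outside sequence")
    return sequence[:index] + residue + sequence[index + 1:]


if __name__ == "__main__":
    fragment = "YADSVKG"                  # hypothetical fragment with one D
    mutant = substitute(fragment, 1, "D")  # introduce an adjacent second D
    print(fragment, net_charge(fragment))
    print(mutant, net_charge(mutant))
```

In this toy case the substitution lowers the net charge by one unit, which is the direction of change the paragraph describes for the selected variants.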
One important safety issue in protein therapeutics is
immunogenicity. Many previous studies suggest that formation of
sub-visible aggregates exerts a major influence on the humoral
immune response [39,40]. In this work, the antibody titer
represents both the quantity and quality (affinity) of IgG that is
specific to a given Vh domain. Although we cannot determine
which factor contributes more to the titer, it is clear
that the mouse immune system hardly responded to the selected
Vh domains, even with CFA, compared with Vhh, as shown in
Figure 11. This may be attributed to the favorable folding properties
of the selected Vh domains, which prevent aggregation, as we
employed a Tat-associated protein folding fitness filter.

Database access codes 

The atomic coordinates and structure factors have been
deposited in the Protein Data Bank, www.pdb.org (PDB IDs:
3ZHL, 3ZHK, and 3ZHD).

Supporting Information 

Figure S1. SDS-PAGE of the purified Vh domains used
for the measurement of Far-UV CD spectra. (A) Vh